Skip to content

SqliteStore backend + annotation, audit, and query result cache tools#169

Draft
data-douser wants to merge 4 commits intomainfrom
dd/sqlite-annotation-cache
Draft

SqliteStore backend + annotation, audit, and query result cache tools#169
data-douser wants to merge 4 commits intomainfrom
dd/sqlite-annotation-cache

Conversation

@data-douser
Copy link
Collaborator

Summary

Replace lowdb with sql.js (asm.js build) as a zero-dependency SQLite persistence backend. Add 14 new MCP tools for annotation management, audit workflows, and query result caching. Refactor server code into smaller, focused modules.

Changes

SqliteStore persistence backend

  • server/src/lib/sqlite-store.ts — unified SQLite persistence with 3 tables:
    • sessions: session tracking (migrated from lowdb)
    • annotations: key-value annotation store with categories and full-text search
    • query_result_cache: BQRS/SARIF result caching with subset retrieval (line ranges, grep, SARIF filters)
  • server/src/types/sql-js.d.ts — type declarations for sql.js
  • server/package.json — added sql.js, removed lowdb

New tools (gated by ENABLE_ANNOTATION_TOOLS env var)

  • Annotation tools (6): annotation_create, annotation_list, annotation_search, annotation_delete
  • Audit tools (4): audit_store_findings, audit_list_findings, audit_add_notes, audit_clear_repo
  • Cache tools (4): query_results_cache_lookup, query_results_cache_retrieve, query_results_cache_clear, query_results_cache_compare

Code refactoring

  • Extracted database-resolver.ts from cli-tool-registry.ts
  • Extracted query-resolver.ts from cli-tool-registry.ts
  • Extracted result-processor.ts from cli-tool-registry.ts
  • Extracted codeql-version.ts from cli-executor.ts
  • cli-tool-registry.ts reduced from ~1060 to ~693 lines

Bug fixes

  • Fix params.output not propagated to processQueryRunResults (caused cache writes to silently skip)
  • Fix duplicate SARIF generation bypassing auto-cache pipeline
  • Fix external predicate conditions for direct query paths (CallGraphFrom/To/FromTo)

Testing

  • All 1074 server unit tests passing
  • New test: sqlite-store.test.ts (31 tests)
  • Updated: session-data-manager.test.ts, monitoring-tools.test.ts

Review order

This PR is independent and should be reviewed first (other PRs depend on it).

Closes #165
Part of #163

Replace lowdb with sql.js (asm.js build) for zero-dependency SQLite persistence.
Bundle inline with esbuild — no native modules, no external deps at runtime.

SqliteStore provides three tables:
- sessions: session tracking (migrated from lowdb)
- annotations: key-value annotation store with categories and metadata
- query_result_cache: BQRS/SARIF result caching with subset retrieval

New tools (gated by ENABLE_ANNOTATION_TOOLS env var):
- annotation_create, annotation_list, annotation_search, annotation_delete
- audit_store_findings, audit_list_findings, audit_add_notes, audit_clear_repo
- query_results_cache_lookup, query_results_cache_retrieve,
  query_results_cache_clear, query_results_cache_compare

Code refactoring for maintainability:
- Extract database-resolver.ts from cli-tool-registry.ts
- Extract query-resolver.ts from cli-tool-registry.ts
- Extract result-processor.ts from cli-tool-registry.ts
- Extract codeql-version.ts from cli-executor.ts

Bug fixes:
- Fix params.output not propagated to proce- Fix params.output not propagated to proce- Fix params.output not propagated txternal predicate conditions for direct query paths

Closes #165
Copilot AI review requested due to automatic review settings March 25, 2026 11:53
@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2026

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 1 package(s) with unknown licenses.
See the Details below.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 99be0ca.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

License Issues

server/package.json

PackageVersionLicenseIssue Type
sql.js^1.14.1NullUnknown License

OpenSSF Scorecard

PackageVersionScoreDetails
npm/sql.js 1.14.1 🟢 3.4
Details
CheckScoreReason
Code-Review⚠️ 2Found 6/30 approved changesets -- score normalized to 2
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Packaging⚠️ -1packaging workflow not detected
Maintained🟢 43 commit(s) and 2 issue activity found in the last 90 days -- score normalized to 4
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Binary-Artifacts🟢 10no binaries found in the repo
Pinned-Dependencies🟢 6dependency not pinned by hash detected -- score normalized to 6
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Security-Policy⚠️ 0security policy file not detected
Fuzzing⚠️ 0project is not fuzzed
License🟢 9license file detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
Signed-Releases⚠️ 0Project has not signed or included provenance with any releases.
SAST⚠️ 0SAST tool is not run on all commits -- score normalized to 0
npm/sql.js ^1.14.1 UnknownUnknown

Scanned Files

  • package-lock.json
  • server/package.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the server’s persistence layer from lowdb to a unified sql.js-backed SqliteStore, and introduces opt-in MCP tools for annotations, audit workflows, and query result caching while refactoring CLI-related logic into smaller modules.

Changes:

  • Replace lowdb session persistence with SqliteStore (sessions + annotations + query result cache tables).
  • Add new opt-in MCP tools: annotation_*, audit_*, and query_results_cache_*.
  • Refactor query/database resolution and query-result processing (interpretation + auto-caching) into dedicated modules.

Reviewed changes

Copilot reviewed 18 out of 20 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
server/test/src/tools/monitoring-tools.test.ts Updates monitoring-tool tests to account for async store initialization and new config flag.
server/test/src/lib/sqlite-store.test.ts Adds unit tests for SqliteStore sessions, annotations, persistence, and cache behaviors.
server/test/src/lib/session-data-manager.test.ts Updates tests to await async initialization after migrating persistence backend.
server/src/types/sql-js.d.ts Adds minimal typings for sql.js asm.js import + core DB surface.
server/src/types/monitoring.ts Adds enableAnnotationTools config flag (default false).
server/src/tools/cache-tools.ts Introduces query_results_cache_* tools (lookup/retrieve/clear/compare).
server/src/tools/audit-tools.ts Introduces audit_* tools layered on annotations.
server/src/tools/annotation-tools.ts Introduces annotation_* tools for CRUD + search.
server/src/lib/sqlite-store.ts Implements the new unified SQLite persistence backend + cache subset retrieval.
server/src/lib/session-data-manager.ts Migrates session persistence from lowdb to SqliteStore; adds getStore().
server/src/lib/result-processor.ts Extracts query result interpretation/evaluation and auto-cache pipeline.
server/src/lib/query-results-evaluator.ts Adds query metadata caching with mtime-based invalidation.
server/src/lib/query-resolver.ts Extracts query-path resolution via codeql resolve queries.
server/src/lib/database-resolver.ts Extracts database-path resolution and caches results in-memory.
server/src/lib/codeql-version.ts Adds target/actual CodeQL version tracking and mismatch warning.
server/src/lib/cli-tool-registry.ts Integrates extracted resolvers/result processor and fixes output propagation + predicate handling.
server/src/lib/cli-executor.ts Wires version detection into startup validation; re-exports version helpers.
server/src/codeql-development-mcp-server.ts Registers new annotation/audit/cache tools at startup and initializes session manager.
server/package.json Removes lowdb, adds sql.js.
package-lock.json Updates lockfile to reflect dependency swap (remove lowdb/steno, add sql.js).
Comments suppressed due to low confidence (2)

server/src/lib/sqlite-store.ts:425

  • updateAnnotation sets updated_at = datetime('now'), which produces a different string format than the ISO timestamps written in createAnnotation. This can cause incorrect ordering when sorting by updated_at. Consider updating this to use the same format as inserts.
    setClauses.push("updated_at = datetime('now')");

    const db = this.ensureDb();
    db.run(
      `UPDATE annotations SET ${setClauses.join(', ')} WHERE id = $id`,
      params as Record<string, string | number | null>,
    );

server/src/lib/session-data-manager.ts:49

  • Docstring says “sql.js WASM init is async”, but this store uses the asm.js build. Please adjust the wording so it’s accurate (async init is still true, but it’s not specifically WASM).
  /**
   * Initialize the database and ensure it's properly set up.
   * Must be awaited before any session operations (sql.js WASM init is async).
   */

- Fix TOCTOU in query-results-evaluator (openSync/fstatSync/readFileSync(fd))
- Use datetime('now') consistently for annotation timestamps
- Debounce flush() with 200ms coalescing via scheduleFlush()
- Fix resultIndices to inclusive [start, end] range with clamping
- Fix WASM→asm.js comments in session-data-manager
- Fix audit-tools header comment (no separate ENABLE_AUDIT_TOOLS flag)
- Add separate maxResults parameter for SARIF in cache-tools
- Use createProjectTempDir() in all test files
- Fix monitoring-tools test init order (mock before initialize)
- Add store.close() in session-data-manager test afterEach
Copilot AI review requested due to automatic review settings March 27, 2026 01:11
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 23 changed files in this pull request and generated 6 comments.

- Bound metadataCache to 256 entries with oldest-first eviction
- Make SqliteStore.initialize() idempotent (close existing db first)
- Fix TOCTOU in initialize(): try readFileSync directly instead of existsSync
- Atomic flush: write to temp file + renameSync to prevent corruption
- Clarify annotation_search uses substring LIKE matching, not FTS
- Close store in monitoring-tools test afterEach to prevent timer leaks
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 23 changed files in this pull request and generated 5 comments.

Comment on lines +52 to 61
// Open once, then fstat + read via the fd to avoid TOCTOU race (CWE-367).
const fd = openSync(queryPath, 'r');
const mtime = fstatSync(fd).mtimeMs;
const cached = metadataCache.get(queryPath);
if (cached && cached.mtime === mtime) {
return cached.metadata;
}

const queryContent = readFileSync(fd, 'utf-8');
const metadata: QueryMetadata = {};
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extractQueryMetadata opens the query file with openSync() but never closes the file descriptor. This leaks FDs and can eventually break query processing under load; it also leaks on the cache-hit early return. Ensure the fd is always closed (e.g., try/finally with closeSync(fd)), including the cached-return path.

Copilot uses AI. Check for mistakes.
Comment on lines +7 to 9
import { fstatSync, openSync, readFileSync, writeFileSync } from 'fs';
import { dirname, isAbsolute } from 'path';
import { mkdirSync } from 'fs';
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two separate imports from 'fs' (import { ... } from 'fs' plus import { mkdirSync } from 'fs'). This is redundant and makes it easier for the import list to drift. Consolidate into a single fs import.

Suggested change
import { fstatSync, openSync, readFileSync, writeFileSync } from 'fs';
import { dirname, isAbsolute } from 'path';
import { mkdirSync } from 'fs';
import { fstatSync, mkdirSync, openSync, readFileSync, writeFileSync } from 'fs';
import { dirname, isAbsolute } from 'path';

Copilot uses AI. Check for mistakes.
Comment on lines +208 to +213
const data = this.db.export();
const buffer = Buffer.from(data);
const tmpPath = this.dbPath + '.tmp';
writeFileSync(tmpPath, buffer);
renameSync(tmpPath, this.dbPath);
this.dirty = false;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush() writes to a temp file and then calls renameSync(tmpPath, this.dbPath). On Windows, fs.renameSync fails if the destination already exists, which would break persistence after the first flush. Consider deleting/overwriting the existing DB file first, or switching to a cross-platform atomic-replace approach.

Copilot uses AI. Check for mistakes.
Comment on lines +385 to +388
if (filter?.search) {
conditions.push('(content LIKE $search OR metadata LIKE $search OR label LIKE $search)');
params.$search = '%' + filter.search + '%';
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/issue text calls out “full-text search” for annotations, but the implementation here uses LIKE substring matching. Either update the PR description/docs to reflect substring search, or implement an actual SQLite FTS table/virtual table for annotations to support true full-text search semantics.

Copilot uses AI. Check for mistakes.
Comment on lines +113 to +121
content: [{
type: 'text' as const,
text: JSON.stringify({
totalResults: subset.totalResults,
returnedResults: subset.returnedResults,
truncated: subset.truncated,
results: JSON.parse(subset.content),
}, null, 2),
}],
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query_results_cache_retrieve assumes getCacheSarifSubset() always returns JSON and unconditionally does JSON.parse(subset.content). But SqliteStore.getCacheSarifSubset() falls back to returning plain-text content when the cached SARIF is invalid/unparseable. In that case this handler will throw and fail the tool call. Handle the non-JSON fallback (e.g., catch JSON.parse errors and return raw text, or ensure getCacheSarifSubset() always returns valid JSON).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SqliteStore backend + annotation, audit, and query result cache tools

2 participants